Optimal Task Ordering in Chain Data Flows: Exploring the Practicality of Non-scalable Solutions
Authors
Abstract
Modern data flows generalize traditional Extract-Transform-Load and data integration workflows in order to enable end-to-end data processing and analytics. The more complex they become, the more pressing the need for automated optimization solutions. Data flow optimization comes in several forms, among which optimal task ordering is one of the most challenging. We take a practical approach; motivated by real-world examples, such as those captured by the TPC-DI benchmark, we argue that exhaustive non-scalable solutions are indeed a valid choice for chain flows. Our contribution is that we thoroughly discuss the three main directions for exhaustive enumeration of task ordering alternatives, namely backtracking, dynamic programming and topological sorting, and we provide concrete evidence of the chain-flow sizes and levels of flexibility up to which they can be applied.
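To make the idea of exhaustive enumeration concrete, the following is a minimal sketch of the backtracking direction: it generates every task ordering that respects a set of precedence constraints and keeps the cheapest one. The abstract does not state a cost model, so the one used here (each task has a per-record cost and a selectivity, and the flow cost is the sum of task costs weighted by the data volume reaching each task) is an assumption, as are all function and parameter names.

```python
def enumerate_orderings(tasks, precedes):
    """Yield every ordering of `tasks` that respects the (before, after)
    pairs in `precedes`, using plain backtracking."""
    preds = {t: set() for t in tasks}
    for before, after in precedes:
        preds[after].add(before)

    placed, placed_set = [], set()

    def backtrack():
        if len(placed) == len(tasks):
            yield tuple(placed)
            return
        for t in tasks:
            # A task may be placed once all of its prerequisites are placed.
            if t not in placed_set and preds[t] <= placed_set:
                placed.append(t)
                placed_set.add(t)
                yield from backtrack()
                placed.pop()
                placed_set.remove(t)

    yield from backtrack()


def flow_cost(order, cost, sel):
    """Assumed cost model (not taken from the abstract): each task pays its
    per-record cost on the fraction of data that survives earlier tasks."""
    total, volume = 0.0, 1.0
    for t in order:
        total += volume * cost[t]
        volume *= sel[t]
    return total


def best_ordering(tasks, precedes, cost, sel):
    """Exhaustively pick the valid ordering with the lowest assumed cost."""
    return min(enumerate_orderings(tasks, precedes),
               key=lambda order: flow_cost(order, cost, sel))


# Hypothetical three-task chain in which A must run before C.
tasks = ["A", "B", "C"]
precedes = [("A", "C")]
cost = {"A": 1.0, "B": 1.0, "C": 1.0}
sel = {"A": 1.0, "B": 0.5, "C": 1.0}
plan = best_ordering(tasks, precedes, cost, sel)
```

The number of valid orderings grows factorially as precedence constraints are relaxed, which is why such a search only suits flows of limited size and flexibility.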
Similar resources
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task, and an unfocused approach often yields undesired results. Several new ideas have therefore been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...
Multi-objective robust optimization model for social responsible closed-loop supply chain solved by non-dominated sorting genetic algorithm
In this study a supply chain network design model has been developed considering both forward and reverse flows through the supply chain. Total Cost, environmental factors such as CO2 emission, and social factors such as employment and fairness in providing job opportunities are considered in three separate objective functions. The model seeks to optimize the facility location proble...
A single-vendor and a single-buyer integrated inventory model with ordering cost reduction dependent on lead time
Lead time is one of the major constraints that affect planning at every stage of the supply chain system. In this paper, we study a continuous review inventory model in which ordering cost reductions depend on lead time. The study addresses a two-echelon supply chain problem consisting of a single vendor and a single buyer. The main contribution of this study is that the in...
Determination of Material Flows in a Multi-echelon Assembly Supply Chain
This study aims to minimize the total cost of a four-echelon supply chain including suppliers, an assembler, distributors, and retailers. The total cost consists of purchasing raw materials from the suppliers by the assembler, assembling the final product, materials transportation from the suppliers to the assembler, product transportation from the assembler to the distributors, product transpo...
Coordinating Pricing and Ordering Decisions in a Multi-Echelon Pharmacological Supply Chain under Different Market Power using Game Theory
Supply chains are of considerable importance in the pharmacological industry: many pharmacological supply chains now play a critical role in supplying and distributing drugs in the health sector. This study therefore examines a three-echelon pharmacological supply chain containing multiple raw-material distributors, a pharmaceutical factory, and multiple drug distribution companies such...
Publication date: 2017